Tradeoffs in subjective testing methods for image and video quality assessment
Abstract
An objective quality estimator for still images or video should accurately estimate the perceived quality scores of a collection of stimuli. New applications and processing techniques introduce novel distortions, which must be quantified in terms of perceived quality before an objective quality estimator can be evaluated with confidence. The subjective testing method used to obtain the perceived quality scores affects the accuracy and reliability of the data collected. Two common methods for collecting perceived quality scores are absolute category rating (ACR) and subjective assessment methodology for video quality (SAMVIQ). The ACR test method presents stimuli in a random order and uses a coarse-resolution rating scale for evaluation. The SAMVIQ test method allows the observer to freely view several stimuli multiple times and uses a fine-resolution rating scale for evaluation. Ease of implementation typically favors the adoption of ACR over SAMVIQ, since ACR accommodates more stimuli per testing session. This paper investigates the tradeoffs between these two subjective testing methods using three subjective databases, each of which provides perceived quality scores from both the ACR and SAMVIQ test methods. The results show that 1) the fine-resolution rating scale used by SAMVIQ is superfluous, 2) SAMVIQ scores have greater accuracy than ACR scores for the same number of observers (on average, 30% fewer observers were required for SAMVIQ than for ACR to reach the same level of accuracy), 3) SAMVIQ scores differentiate stimuli better than ACR scores, and 4) the consistency of categorical ratings between ACR and SAMVIQ is lower for databases whose stimuli are more difficult to distinguish in terms of perceived quality. Increasing the number of observers for ACR yields more accurate scores, competitive with the accuracy achieved with fewer observers using SAMVIQ. Despite the evidence favoring SAMVIQ for obtaining perceived quality scores, scores obtained with ACR predict those obtained with SAMVIQ when the stimuli are easier to distinguish in terms of perceived quality.
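The link between the number of observers and the accuracy of a score can be illustrated with a minimal Python sketch. It assumes, for illustration only, that accuracy is measured by the half-width of an approximate 95% confidence interval around the mean opinion score (normal approximation, z = 1.96); the abstract does not state which accuracy measure the paper uses. The function names and the sample ratings are hypothetical.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean opinion score, half-width of an approximate 95% CI).

    Assumption: normal approximation with z = 1.96; this is an
    illustrative accuracy measure, not necessarily the paper's.
    """
    n = len(ratings)
    mean = statistics.fmean(ratings)
    std = statistics.stdev(ratings)  # sample standard deviation (n - 1)
    return mean, z * std / math.sqrt(n)


def observers_needed(std, target_half_width, z=1.96):
    """Smallest observer count n with z * std / sqrt(n) <= target_half_width."""
    return math.ceil((z * std / target_half_width) ** 2)


# Hypothetical ratings of one stimulus on a 5-level categorical scale.
example_ratings = [3, 4, 2, 4, 3, 5, 3, 2, 4, 3, 4, 3]
mos, half_width = mos_with_ci(example_ratings)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")

# Under this model, a method whose ratings spread less across observers
# needs fewer observers to reach the same confidence-interval width.
print(observers_needed(std=1.0, target_half_width=0.5))
print(observers_needed(std=0.8, target_half_width=0.5))
```

Under this simplified model the required number of observers grows with the square of the rating standard deviation, which is one way a method with less observer disagreement could reach a given accuracy with a smaller panel.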
Similar resources
A Machine Learning Approach to No-Reference Objective Video Quality Assessment for High Definition Resources
The video quality assessment must be adapted to the human visual system, which is why researchers have performed subjective viewing experiments in order to obtain the conditions of encoding of video systems to provide the best quality to the user. The objective of this study is to assess the video quality using image features extraction without using reference video. RMSE values and processing ...
Are Existing Procedures Enough? Image and Video Quality Assessment: Review of Subjective and Objective Metrics
Images and videos are subject to a wide variety of distortions during acquisition, digitizing, processing, restoration, compression, storage, transmission and reproduction, any of which may result in degradation in visual quality. That is why image quality assessment plays a major role in many image processing applications. Image and video quality metrics can be classified by using a number of ...
A Comparison of Full-Reference Image Quality Assessment Methods
In this contribution, different image quality assessment methods are compared. Their basic principles are introduced and evaluation results are given. 1. Introduction Image and video quality measures play an important role in a variety of image and video processing applications. Very often the quality of an image needs to be quantified. This can be done by subjective testing sessions, or by obj...
Quality Assessment of Turfgrasses Using NTEP Method Compared to an Image-Based Scoring System
The current methods of turfgrass evaluations are often based on human-based assessment methods. However, eliminating subjective errors from such evaluations is often impossible. This research compared the accuracy of human-based and digital image processing-based methods for quality assessment of turfgrasses. Four turfgrass plots were evaluated using the two mentioned methods. In the human-base...